DOCUMENT RESUME

ED 193 S^^                                              CG 014 695

AUTHOR        Sauser, William I., Jr.
TITLE         A Comparison of the Effects of Rater Training and
              Participation on Sources of Variance in a Set of BARS
              Ratings.
PUB DATE      Mar 80
NOTE          17p.; Paper presented at the Annual Meeting of the
              Southeastern Psychological Association (26th,
              Washington, DC, March 26-29, 1980).
EDRS PRICE    MF01/PC01 Plus Postage.
DESCRIPTORS   *Behavior Rating Scales; Comparative Analysis;
              Participation; Psychometrics; *Student Evaluation of
              Teacher Performance; Student Teacher Relationship;
              Teacher Effectiveness; Teaching Styles; *Test
              Construction; *Test Validity; *Training
IDENTIFIERS   Variance (Statistical)

ABSTRACT
        The effects of training and participation on sources
of variance in a set of ratings of college classroom teaching
effectiveness were compared. College students (N=96) were randomly
assigned to four cells of the experimental design. Subjects in cells
(A) and (B) participated in the construction of a set of
behaviorally-anchored rating scales (BARS) of five aspects of college
classroom teaching performance, while subjects in cells (C) and (D)
performed a control task. Later, subjects in cells (A) and (C) were
exposed to a rater training program, while subjects in cells (B) and
(D) performed a control task. All subjects then evaluated five
standardized simulated professors using the BARS. Training
significantly reduced the overall elevation of the ratings;
participation did not. Neither participation nor training
significantly reduced the variance attributable to the category of
behavior being evaluated. Both participation and training
significantly reduced variance attributable to the professor being
rated. Participation significantly increased the Category x Professor
effect while training did not. There were no significant interactions
among the treatments with regard to effects on any of the above
characteristics of ratings. Findings suggest that, for these four
characteristics of ratings, participation and training operate
independently of each other. (Author)



Reproductions supplied by EDRS are the best that can be made
from the original document.






A Comparison of the Effects of Rater Training and 
Participation on Sources of Variance in 
a Set of BARS Ratings 

William I. Sauser, Jr.
Auburn University

Paper presented at the meeting of the
Southeastern Psychological Association
Washington, D.C.
March 1980




Purpose 

The most popular criterion measure employed in personnel research
and practice today is the rating scale (Blum & Naylor, 1968, pp. 197-198).
Despite its popularity, the rating method has been severely
criticized due to questionable levels of reliability and validity
(Ronan & Schwartz, 1974) and susceptibility to "rating errors" such
as leniency, central tendency, and halo (Smith, 1976). While several
techniques have been used in attempts to improve the quality of
ratings as criteria, the two approaches found generally most
successful are rater training (Guilford, 1954, p. 280; Latham, Wexley,
& Pursell, 1975) and rater participation in scale construction
(Campbell, Dunnette, Arvey, & Hellervik, 1973; Smith & Kendall, 1963).
The industrial/organizational psychology literature contains numerous
studies of the effectiveness of these two approaches, yet direct
comparisons of their effects on psychometric characteristics of
ratings are scarce. The purpose of this study was to directly
compare the effects of training and participation on sources of
variance in a set of ratings of college classroom teaching
effectiveness.






Method 



Ninety-six undergraduate students taking courses in psychology
at a large southeastern university were randomly assigned to four
cells of the experimental design: (a) Both Participation and
Training, (b) Participation Only, (c) Training Only, and (d)
Neither Participation nor Training. Subjects in cells (a) and (b)
participated in the construction of a set of behaviorally anchored
rating scales (BARS) for measuring five aspects of college
classroom teaching performance, while subjects in cells (c) and (d)
performed a control task. Later, subjects in cells (a) and (c)
were exposed to a rater training program, while subjects in cells
(b) and (d) performed a control task. All subjects then evaluated
five standardized simulated professors using the BARS. These
"simulated professors" consisted of short biographical descriptions
followed by behavioral diaries containing scaled incidents obtained
during the BARS construction process.
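
The random assignment described above can be illustrated with a short sketch. It is only an illustration: the paper does not say how the randomization was carried out or whether the four cells were of equal size, so the equal split (24 subjects per cell), the subject numbering, and the function name `assign_to_cells` are all assumptions.

```python
import random

# The four cells of the 2 x 2 (Participation x Training) design.
CELLS = {
    "a": {"participation": True, "training": True},
    "b": {"participation": True, "training": False},
    "c": {"participation": False, "training": True},
    "d": {"participation": False, "training": False},
}


def assign_to_cells(subject_ids, seed=None):
    """Shuffle the subjects and deal them evenly into the four cells.

    Equal cell sizes are an assumption made for this sketch.
    """
    subjects = list(subject_ids)
    if len(subjects) % len(CELLS) != 0:
        raise ValueError("number of subjects must be divisible by four")
    rng = random.Random(seed)
    rng.shuffle(subjects)
    per_cell = len(subjects) // len(CELLS)
    return {
        label: sorted(subjects[i * per_cell:(i + 1) * per_cell])
        for i, label in enumerate(CELLS)
    }


# Hypothetical subject numbers 1..96.
assignment = assign_to_cells(range(1, 97), seed=1980)
for label, members in assignment.items():
    print(label, CELLS[label], len(members))
```

Fixing the seed makes the allocation reproducible, which is convenient for an example but is not something the paper reports.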






Table 6. Definitions of the Five Categories of
College Classroom Teaching Behavior



A. Relationships with Students. This category refers to the way the
   professor treats his/her students both in and out of class. It
   includes such things as talking with students before, during, and
   after class, interacting with and counseling students in the office
   and elsewhere regarding course-related and personal problems,
   knowing students' names, and treating students with respect in class.

B. Ability to Present the Material. This category refers to the way
   the professor organizes the material and presents it to the class.
   It includes such things as coming to class well-prepared and on
   time, organizing the material in a logical manner, speaking and
   writing clearly, and using examples, audio-visual aids, and other
   devices to get the material across to the students.

C. Interest in Course and Material. This category refers to the
   professor's knowledge of and interest in the material he/she is
   trying to teach. It includes such things as being able to answer
   questions and elaborate on the material, showing enthusiasm for the
   course, and reading and researching to keep current and learn more
   about the subject matter.

D. Reasonableness of the Workload. This category refers to the amount
   of work (reading, homework problems, class and lab work, papers,
   tests, etc.) assigned by the professor. It includes such things
   as clearly specifying assignments and due dates, scheduling the
   work evenly throughout the quarter, and keeping the workload
   appropriate to the credit-hour value of the course.

E. Fairness of Testing and Grading. This category refers to the
   fairness of the professor's testing and grading policies. It includes
   such things as stating how grades are to be determined, testing
   over appropriate material, and grading without bias.






APPENDIX F-1 (Cont'd)



D. Reasonableness of the Workload



This dimension refers to the amount of work (reading, homework
problems, class and lab work, papers, tests, etc.) assigned by
the professor. It includes such things as clearly specifying
assignments and due dates, scheduling the work evenly throughout
the quarter, and keeping the workload appropriate to the
credit-hour value of the course.



Scale anchors: Best Possible, Exactly Neutral, Worst Possible



This professor could be expected to discontinue or
reduce homework assignments around midterms and finals
so that his students would have more time to study.

This professor could be expected to distribute the
workload evenly across the quarter.

This professor could be expected to assign reasonable
amounts of homework every other day.

This professor could be expected to assign homework a
few times a week but not every day.

This professor could be expected to assign a four-to-five
page typewritten paper and specify the format and
style in which it is to be written.

This professor could be expected to assign about fifty
pages of reading per week.

This professor could be expected to require a term paper,
oral presentation, and weekly tests.

This professor could be expected to require a lot of
memorization for his class.

This professor could be expected sometimes to assign two
chapters for one night's assignment.

This professor could be expected to surprise her students
with an extra assignment toward the end of the quarter.

This professor could be expected twice to assign five
page papers two days before they are due.






APPENDIX G-1



PROFESSOR L

Professor L is a 2?-year-old male Assistant Professor who is
new at Auburn. He has long red hair, a full beard and moustache,
and is a heavy smoker. He usually wears jeans and flannel shirts,
boots, and a black leather jacket to class. He is not very well
known in his field but has initiated a number of research projects
since arriving at Auburn. He teaches a ^-hour, 300-level science
course with a laboratory.

You observed the following things about Professor L while taking
his course:

He used a variety of methods to present the material, including
films, tapes, and experiments.

He told the class he would grade on a 10-point scale, then
actually used a 7-point scale to assign final grades.

He often described his own fascination with the material he
was covering.

He gave a mid-term and final only.

He assigned only as much homework as was necessary to learn
the material thoroughly.

He was attentive and helpful in class, but was generally
unavailable for outside help.

He gave plenty of time to read the material and discussed it
thoroughly in class.

Once when asked a question in class he lost patience with
himself because he could not answer it.

He always left promptly after giving his lectures.

When asked by his students what to study for a test, he said,
"I don't know, I haven't made it out yet."

He did not curve grades even if the average score was in the
50s or 60s.

He gave a student unclear and evasive answers to her questions
when she visited his office.

His lectures were boring and unorganized.

He assigned about two hours worth of work to be done during
his three-hour laboratory so that no one would have to rush.






APPENDIX G-1 (Cont'd)



He took lectures straight from the book and never gave
examples.

He often told the class about interesting articles he had
read or experiments he had heard about.

Although he gave his office number and hours on the first day
of class, he did not encourage the students to come see him.

Once when confounded by a student's question in class he spent
several hours of his own time that afternoon researching material
for an answer.

He reduced the workload at the end of the quarter when he
realized that his students did not have enough time to complete
all of the assignments.

He sought student input to support his conclusions in class.





Table 1. Scale Values in the Simulated
Professor x Category Matrix



                        Simulated Professor

Category              L      M      N      O      P  Row Sum  Row Mean  Row Variance

A                   4.0   10.0    8.0    2.0    6.0     30.0       6.0           8.0
B                   6.0    2.0   10.0    4.0    8.0     30.0       6.0           8.0
C                   8.0    4.0    6.0   10.0    2.0     30.0       6.0           8.0
D                   9.6    6.0    2.0    8.0    4.0     29.6       5.9           7.4
E                   2.0    8.0    4.0    6.1   10.0     30.1       6.0           8.0

Column Sum         29.6   30.0   30.0   30.1   30.0
Column Mean         5.9    6.0    6.0    6.0    6.0
Column Variance     7.4    8.0    8.0    8.0    8.0
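
The marginal sums, means, and variances in Table 1 can be recomputed from the cell values. The sketch below is a check under stated assumptions rather than the author's procedure: it assumes the tabled variances are population variances (dividing by 5), which is the choice that reproduces the printed 8.0 and 7.4. The entry for Category B under Professor O (4.0) is an inference from the printed row and column sums rather than a legible value.

```python
from statistics import mean, pvariance

PROFESSORS = ["L", "M", "N", "O", "P"]

# Scale values by category (rows) and simulated professor (columns).
SCALE_VALUES = {
    "A": [4.0, 10.0, 8.0, 2.0, 6.0],
    "B": [6.0, 2.0, 10.0, 4.0, 8.0],  # 4.0 (Professor O) inferred from the sums
    "C": [8.0, 4.0, 6.0, 10.0, 2.0],
    "D": [9.6, 6.0, 2.0, 8.0, 4.0],
    "E": [2.0, 8.0, 4.0, 6.1, 10.0],
}


def summarize(values):
    """Return (sum, mean, population variance), each rounded to one decimal."""
    return (
        round(sum(values), 1),
        round(mean(values), 1),
        round(pvariance(values), 1),
    )


row_summary = {cat: summarize(vals) for cat, vals in SCALE_VALUES.items()}

# Transpose the matrix to get one tuple of values per professor.
columns = list(zip(*SCALE_VALUES.values()))
column_summary = {prof: summarize(col) for prof, col in zip(PROFESSORS, columns)}

for cat, stats in row_summary.items():
    print("Category", cat, stats)
for prof, stats in column_summary.items():
    print("Professor", prof, stats)
```

Every row and column has nearly the same sum, mean, and variance, so the matrix is balanced: differences among simulated professors exist only within categories, not in overall level.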




Analysis 

The data were analyzed in a split-plot factorial ANOVA with
Participation and Training (two levels each) serving as
between-subjects factors and Categories (of performance) and Professors
(five levels each) as within-subjects factors. Additional analyses
were performed to interpret various significant interactions among
the factors. The omega-square statistic was employed to determine
the practical significance of statistically significant effects.
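
The paper does not print the formula used for omega-square. As a hedged illustration, the sketch below uses a common fixed-effects estimate, (SS_effect - df_effect x MS_error) / (SS_total + MS_error), and takes the Residual mean square as MS_error. Both choices are assumptions; they were selected because they reproduce the omega-square values printed in the all-subjects ANOVA table (Table 8), from which the example numbers are copied.

```python
def omega_squared(ss_effect, df_effect, ms_error, ss_total):
    """Estimate the proportion of total variance attributable to an effect.

    Assumed formula: (SS_effect - df_effect * MS_error) / (SS_total + MS_error).
    """
    return (ss_effect - df_effect * ms_error) / (ss_total + ms_error)


# Values copied from Table 8 (all subjects).
SS_TOTAL = 20764.4391
MS_RESIDUAL = 7599.6374 / 2183  # Residual SS divided by Residual df

training = omega_squared(31.4683, 1, MS_RESIDUAL, SS_TOTAL)
categories = omega_squared(314.3566, 4, MS_RESIDUAL, SS_TOTAL)

print(f"Training:   {training:.4f}")    # tabled value: .0013
print(f"Categories: {categories:.4f}")  # tabled value: .0145
```

On this estimate, a statistically significant effect can still account for well under one percent of the total variance, which is why the statistic is reported alongside the F tests.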






Table 8. Study One ANOVA Table: All Subjects



Source                         df            SS         F         ω²

Participation                   1        1.6964      0.77
Training                        1       31.4683     14.35*     .0013
Part x Train                    1        0.4991      0.23
Subjects w. groups              4        8.7720      0.63
Categories                      4      314.3566     22.57***   .0145
Part x Cat                      4       27.6427      1.99
Train x Cat                     4        7.5794      0.54
Part x Train x Cat              4       17.2175      1.24
Cat x Subj w. grp              16       82.7864      1.49
Professors                      4       67.1426      4.82***   .0026
Part x Prof                     4       47.9253      3.44**    .0016
Train x Prof                    4       34.0416      2.44*     .0010
Part x Train x Prof             4       20.5515      1.48
Prof x Subj w. grp             16       54.7207      0.98
Cat x Prof                     16    12071.2852    216.72***   .5786
Part x Cat x Prof              16      102.4909      1.84*     .0023
Train x Cat x Prof             16       79.9003      1.43
Part x Train x Cat x Prof      16       26.9005      0.48
Cat x Prof x Subj w. grp       64      167.8245      0.75
Residual                     2183     7599.6374
Total                        2382    20764.4391



All effects were tested against Residual except for Participation,
Training, and Part x Train, which were tested against Subjects w. groups.

*p < .05

**p < .01

***p < .001
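
The note to Table 8 says which error term each effect was tested against. The sketch below recomputes a few of the tabled F ratios from the printed degrees of freedom and sums of squares under that rule. The dictionary holds only a subset of the rows, and the names `TABLE_8`, `mean_square`, and `f_ratio` are illustrative, not taken from the paper.

```python
# (df, SS) pairs copied from Table 8; only the rows needed for the example.
TABLE_8 = {
    "Participation": (1, 1.6964),
    "Training": (1, 31.4683),
    "Part x Train": (1, 0.4991),
    "Subjects w. groups": (4, 8.7720),
    "Categories": (4, 314.3566),
    "Professors": (4, 67.1426),
    "Cat x Prof": (16, 12071.2852),
    "Residual": (2183, 7599.6374),
}

# Effects that the table note says were tested against Subjects w. groups.
BETWEEN_SUBJECTS = {"Participation", "Training", "Part x Train"}


def mean_square(source):
    """Mean square = sum of squares divided by degrees of freedom."""
    df, ss = TABLE_8[source]
    return ss / df


def f_ratio(source):
    """F = MS of the effect divided by MS of its error term."""
    error = "Subjects w. groups" if source in BETWEEN_SUBJECTS else "Residual"
    return mean_square(source) / mean_square(error)


for source in ("Participation", "Training", "Categories", "Professors", "Cat x Prof"):
    print(f"{source}: F = {f_ratio(source):.2f}")
```

The recomputed ratios agree with the tabled values (0.77, 14.35, 22.57, 4.82, and 216.72), which supports the reading of the degrees of freedom and sums of squares used here.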




Table 9. Study One ANOVA Table: Participant Subjects Only



Source                         df            SS         F         ω²

Training                        1       11.6662      3.21
Subjects w. groups              2        7.2767      1.18
Categories                      4      156.9696      5.37*     .0121
Train x Cat                     4        9.8173      0.34
Cat x Subj w. grp               8       58.4766      2.36*     .0032
Professors                      4       25.5242      2.06
Train x Prof                    4       47.1598      3.81**    .0033
Prof x Subj w. grp              8       14.4242      0.58
Cat x Prof                     16     6661.7041    134.54***   .6279
Train x Cat x Prof             16       57.5041      1.16
Cat x Prof x Subj w. grp       32      122.3188      1.24
Residual                     1084     3354.7129
Total                        1183    10527.5546



All effects were tested against Residual except for Categories
and Train x Cat, which were tested against Cat x Subj w. grp; and
Training, which was tested against Subjects w. groups.

*p < .05

**p < .01

***p < .001






Table 10. Study One ANOVA Table: Non-participant Subjects Only



Source                         df            SS         F         ω²

Training                        1       20.4132     27.30*     .0016
Subjects w. groups              2        1.4953      0.19
Categories                      4      184.8966     11.97***   .0165
Train x Cat                     4       15.4290      1.00
Cat x Subj w. grp               8       24.3098      0.79
Professors                      4       89.2225      5.77***   .0072
Train x Prof                    4        7.4333      0.48
Prof x Subj w. grp              8       40.2965      1.30
Cat x Prof                     16     5511.2407     89.18***   .5322
Train x Cat x Prof             16       50.0515      0.81
Cat x Prof x Subj w. grp       32       45.5057      0.37
Residual                     1099     4244.9245
Total                        1198    10235.2187



All effects were tested against Residual except for Training,
which was tested against Subjects w. groups.

*p < .05

***p < .001






Table 11. Study One ANOVA Table: Trained Subjects Only



Source                         df            SS         F         ω²

Participation                   1        2.1316      3.69
Subjects w. groups              2        1.1542      0.17
Categories                      4      179.2890      5.74*     .0148
Part x Cat                      4       34.3782      1.10
Cat x Subj w. grp               8       62.4300      2.27*     .0035
Professors                      4       33.1986      2.42*     .0019
Part x Prof                     4       50.7707      3.70**    .0037
Prof x Subj w. grp              8       35.1496      1.28
Cat x Prof                     16     5706.6370    103.97***   .5643
Part x Cat x Prof              16       56.1690      1.02
Cat x Prof x Subj w. grp       32      100.8442      0.92
Residual                     1093     3749.4051
Total                        1192    10011.5572



All effects were tested against Residual except for Categories
and Part x Cat, which were tested against Cat x Subj w. grp; and
Participation, which was tested against Subjects w. groups.

*p < .05

**p < .01

***p < .001




Table 12. Study One ANOVA Table: Untrained Subjects Only



Source                         df            SS         F         ω²

Participation                   1        0.1454      0.04
Subjects w. groups              2        7.6178      1.08
Categories                      4      142.5991     10.09***   .0120
Part x Cat                      4       10.9008      0.77
Cat x Subj w. grp               8       20.3564      0.72
Professors                      4       68.0058      4.81***   .0050
Part x Prof                     4       17.3102      1.23
Prof x Subj w. grp              8       19.5712      0.69
Cat x Prof                     16     6444.4720    114.03***   .5936
Part x Cat x Prof              16       73.2224      1.30
Cat x Prof x Subj w. grp       32       66.9803      0.59
Residual                     1090     3850.2323
Total                        1189    10721.4136



All effects were tested against Residual except for Participation,
which was tested against Subjects w. groups.

***p < .001




Conclusions 

The major findings were: (1) Training significantly reduced
the overall elevation of the ratings, whereas participation did not.
(2) Neither participation nor training significantly reduced the
variance attributable to the category of behavior being evaluated.
(3) Both participation and training significantly reduced variance
attributable to the professor being rated. (4) Participation
significantly increased the Category x Professor effect (discriminant
validity) while training did not. (5) There were no significant
interactions among the treatments with regard to effects on any of
the above characteristics of ratings. Thus, it appears that
participation and training operate independently of each other, at least as
far as these four characteristics of ratings are concerned.






References



Blum, M.L., & Naylor, J.C. Industrial psychology: Its theoretical
and social foundations. New York: Harper & Row, 1968.

Campbell, J.P., Dunnette, M.D., Arvey, R.D., & Hellervik, L.V. The
development and evaluation of behaviorally based rating scales.
Journal of Applied Psychology, 1973, 57, 15-22.

Guilford, J.P. Psychometric methods (2nd ed.). New York:
McGraw-Hill, 1954.

Latham, G.P., Wexley, K.N., & Pursell, E.D. Training managers to
minimize rating errors in the observation of behavior. Journal
of Applied Psychology, 1975, 60, 550-555.

Ronan, W.W., & Schwartz, A.P. Ratings as performance criteria.
International Review of Applied Psychology, 1974, 23, 71-82.

Smith, P.C. Behaviors, results, and organizational effectiveness:
The problem of criteria. In M.D. Dunnette (Ed.), Handbook of
industrial and organizational psychology. Chicago: Rand McNally,
1976.

Smith, P.C., & Kendall, L.M. Retranslation of expectations: An
approach to the construction of unambiguous anchors for rating
scales. Journal of Applied Psychology, 1963, 47, 149-155.






